Past studies on clinical judgment comparing clinical and
actuarial (statistical) predictions have met with a great deal
of methodological criticism. This criticism seems to center around
the issues of (1) uncross-validated statistical formulas; (2) inadequate
criterion measures; (3) misrepresentation of judges as
"clinicians"; (4) insufficient power to detect differences due
to small sample size; and (5) use of data unsuitable for clinical
integration. As a result, it has been concluded by a number of
researchers  that  there  has  never  been  a  true  or  fair  test  of 
clinical  versus  actuarial  prediction.  In  order  to  arrive  at  a 
more  accurate  comparison  of  clinical  and  actuarial  prediction, 
the present research called for simultaneous attempts to predict
the same criterion from the same data by clinicians and statisticians
(computer), given the same initial information and
substantially going through the same preliminary steps. This
is the first study to utilize the very essential requirement,
consistently avoided in other studies, of cross-validation
of both the clinical and actuarial prediction systems.

The  cognitive  processes  of  the  clinician  engaged  in  a 
clinical  task  have  always  been  viewed  and  studied  independently 
of  the  accuracy  of  clinical  prediction  and  no  effort  was  ever 
made to look at the relationship between the two. Through utilization
of a complex series of formulas derived to tap the
cognitive  components  of  clinical  inference,  the  cognitive 
processes  of  the  clinician  (linear  and  non-linear  components) 
and  their  relationship  to  accuracy  level  were  studied  in  the 
present  research.  In  addition,  the  variables  of  level  of  aware- 
ness, confidence  and  appropriateness  and  levels  of  experience 
and  expertise  were  also  considered. 

The  clinical  task  utilized  was  the  prediction  of  short  or 
long term stay in psychotherapy from MMPI profiles. The clinical
sample consisted of 222 cases, two-thirds of which were randomly
chosen as the "standardization" sample with the remaining one-third
as the "cross-validation" sample. Judges were 14 clinical
psychologists empirically divided into expert and non-expert
groups and also divided into high and low experience level groups.
The same initial information was given to both the clinician and
the actuary in the form of MMPI scale scores and criterion
information for the standardization sample. Clinicians were given
time to familiarize themselves with this data while the actuarial
analysis resulted in a discriminant function prediction equation.


Both  the  clinical  and  actuarial  prediction  systems  were  tested  on 
the cross-validation sample. Judges were asked to indicate the
level of confidence they had for each decision. In order to obtain
a measure of self-awareness, they were also asked to subjectively
weight each of the predictor variables according to
the importance attached to each in making decisions.

Results  showed  13  out  of  14  clinical  judges  exceeding  the 
accuracy obtained by the actuarial method. A significant positive
relationship  between  accuracy  level  and  amount  of  non-linear 
cognitive  functioning  was  found.  Significant  differences  were 
obtained  on  accuracy  level,  non-linear  cognitive  functioning  and 
level  of  awareness  between  expert  and  non-expert  judges.  Experts 
were  more  accurate,  more  non-linear  and  less  able  to  specify  their 
own  inference  behavior.  No  differences  were  found  between  high 
and  low  experience  level  judges. 

This  is  the  only  clinical  judgment  study  to  date  that  is  so 
overwhelmingly  in  favor  of  the  clinician.  Although  it  is  considered 
the  most  accurate  comparison  of  clinician  and  actuary,  certain  lim- 
itations were  present,  primarily  the  restriction  of  linearity  on 
the  actuarial  system.  This  becomes  more  evident  considering  the 
relationship  between  accuracy  and  non-linear  cognitive  functioning. 


CHAPTER  I 


INTRODUCTION 


Background 


Since  its  inception  as  a  profession,  clinical  psychology  has 
been  called  upon  to  aid  in  or  make  decisions  concerning  a  variety 
of  health  and  mental  health  problems.  As  these  demands  increase 
and  as  more  responsibility  is  designated  to  the  clinician,  pressure 
to  critically  and  systematically  examine  his  own  clinical 
activities  has  been  placed  upon  the  clinical  psychologist.  Fore- 
most among  these  activities  is  the  process  of  clinical  judgment 
or  clinical  inference.  A  review  of  the  literature  on  clinical 
judgment  indicates  that  past  studies  have  generally  dealt  either 
with  the  accuracy  of  clinical  versus  actuarial  (statistical) 
predictions  or  with  the  cognitive  processes  of  the  clinician. 
Furthermore,  these  areas  of  study  have  always  been  viewed  and 
treated  independently  of  each  other  and  no  effort  has  yet  been 
made  to  look  at  the  relationship  between  the  two.  In  addition, 
a  number  of  variables  such  as  levels  of  experience,  confidence, 
appropriateness,  level  of  awareness,  amount  and  type  of  inform- 
ation, accuracy  feedback,  etc.  have  also  been  of  primary  or 


secondary interest in clinical judgment research. The research
to date, however, has been far from conclusive and has met with
a great deal of methodological criticism (Harty, 19??;
Holt, 1958, 1970; Sawyer, 1966).

The  purpose  of  the  research  reported  here,  the  first  in 
a  series  of  studies  on  clinical  judgment,  is  to  deal  with  such 
criticisms  and  "clean  up"  the  methodology  in  hopes  of  finally 
arriving  at  a  more  accurate  test  of  clinical  versus  actuarial 
prediction.  Secondly,  it  is  intended  to  look  at  the  cognitive 
processes  of  the  clinician  within  the  specific  clinical  task 
and  also  in  relation  to  clinical  accuracy.  The  variables  of 
level  of  awareness,  confidence  and  appropriateness  and  levels 
of  experience  and  expertise  will  also  be  considered  in  this 
research.

Clinical Versus Actuarial Prediction
In the past two decades, a great number of studies have
been conducted comparing clinical and actuarial or statistical
prediction. Probably the most well-known and often quoted
survey of the empirical evidence is that of Meehl (1954) who
found, "16 to 20 studies involving a comparison of clinical
and actuarial methods, in all but one of which the predictions
made actuarially were either approximately equal or superior
to those made by a clinician" [p. 119]. Later Meehl (1965)
qualified his previous findings by pointing out that the one
study favoring the clinician did so because of a spuriously
high chi-square and on closer inspection, the superiority of the
clinician vanished. At the same time he reported monitoring
some fifty empirical investigations in which the efficiency of
a human judge in combining information is compared with a
mechanical or statistical procedure. He concludes that, "the
current 'box score' shows a significantly superior predictive
efficiency for the statistical method in about two-thirds of the
investigations, and a substantially equal efficiency in the
rest" [p. 27]. Gough (1963) came to very similar conclusions
after his review of a limited number of prediction studies but
added that he felt that, "no fully adequate study of the
clinician's forecasting skills has been carried out" [p. 582].

By far the most comprehensive and systematic survey to
date has been that of Sawyer (1966) who adduced that the mode of
data collection could be clinical, mechanical or both and that
combination or integration could be either clinical or mechanical,
resulting in six predictive methods. To these he added clinical
synthesis, in which the clinician is informed of the actuarial
prediction, and mechanical synthesis, in which the clinician's
prediction is added to the formula. After classifying and comparing
45 studies, Sawyer found, "the mechanical mode of combination
always equal or superior to the clinical mode" [p. 192].
The only review to date in which the evidence is in favor of
clinical prediction is that of Korman (1968) who, by concentrating
solely on the criterion of managerial performance,
reviewed over 40 prediction studies and concluded, "it would
[seem that] there is no basis for assuming any superiority of
the actuarial over the clinical method at this time. In fact
the evidence is to the contrary" [p. ?6].

Holt  (1958,  1970)  has  been  the  most  ardent  critic  of  the 
prediction  studies  and  feels  that  there  has  never  been  a  true  or 
fair test of clinical versus statistical prediction. He has
criticized prediction studies on a number of issues, the most
salient ones being those which he laid against the studies reviewed
by Sawyer. These criticisms are (1) the use of uncross-validated
formulas in statistical predictions based on multiple
regression; (2) inadequate, and often clinically inappropriate
criterion measures; (3) misrepresentation of various types of
judges as clinicians; (4) insufficient power to detect differences
due to inadequate sample size; and (5) use of data unsuitable
or inapplicable for clinical integration. He sees the
main  problem  however  as  not  having  simultaneous  attempts  to 
predict  the  same  criterion  from  the  same  data  by  clinicians 
and  statisticians  who  are  given  the  same  initial  information 
and  who  have  gone  through  the  same  preliminary  steps.  To 
correct  for  this,  research  would  have  to  utilize  a  very  specific 
design outlined as follows. After study of the criterion behavior
and  the  intervening  variables  affecting  it,  techniques  of 
measurement  would  be  determined  for  both.  These  measurements  or 
estimates  of  the  intervening  variables  would  then  be  considered 
the predictor variables. The predictor variables as well as the
criterion  behavior  would  be  measured  for  a  sample  of  cases  and 


both  these  measurements  would  be  given  to  the  clinician  for 
study in order for him to determine the pattern of relationships
that exist. At the same time both of these measurements
would also be turned over to a statistician (computer) who
would correlate them and arrive at some sort of prediction
equation. The critical test would be performed on a cross-validation
sample with both the clinician and the statistician
being  given  only  the  set  of  predictor  variable  measurements 
from  this  new  sample.  The  statistical  predictions  would  easily 
be  provided  by  entering  this  new  set  of  predictor  variable 
measurements  into  the  previously  obtained  prediction  equation 
while clinical predictions would be provided by simply having
the clinician make his judgments. Both sets of predictions
would  then  be  compared  with  the  criterion  for  accuracy.  Previous 
studies  have  consistently  avoided  this  very  essential  require- 
ment of  cross-validation  of  both  the  actuarial  and  clinical 
prediction  systems.  It  is  only  by  following  this  type  of  design 
that a veritable comparison of clinical versus actuarial prediction
can be made.

If  indeed  it  should  be  shown  that  the  predictions  of  the 
clinician  are  either  more  or  less  accurate  than  those  of  the 
actuary, one is still left with the task of explaining why
this  is  so.  The  explanation  of  such  findings  would  apparently 
entail  looking  at  the  cognitive  processes  of  the  clinician 
engaged  in  the  prediction  task.  Past  studies  of  clinical 
accuracy  seem  to  have  by-passed  this  issue  entirely,  as  if  it 


were of no importance whatsoever, by dealing only with accuracy
and avoiding the study of cognitive processes. Studies investigating
the clinician's cognitive processes, on the other
hand, have invariably refrained from dealing with the level of
accuracy as a variable of importance. What is called for is
research incorporating both the study of accuracy and cognition
and the relationship between them.

Cognitive  Functioning  of  the  Clinician 
On  the  whole  very  few  studies  have  been  done  with  an 
emphasis  on  looking  at  the  cognitive  processes  of  the  clinician 
while  engaged  in  a  clinical  task,  most  likely  because  of  the 
complexities  involved  in  trying  to  measure  what  is  going  on  in 
the  clinician's  head.  What  may  be  termed  reviews  of  the  existing 
literature (Goldberg, 1968; Hammond, 1955; Meehl, 1960) point
to  the  fact  that  although  clinicians  would  like  to  think  that 
their  decisions  are  based  on  complex  cognitive  methods  and  use 
of  configural  cue  relationships,  in  point  of  fact,  most  of  the 
variance  in  their  decisions  can  be  accounted  for  by  simple  linear 
relationships. To arrive at this, a linear multiple regression
formula  is  usually  developed  for  each  clinician  based  upon  his 
predictions  to  a  set  of  predictor  variables.  To  the  extent  that 
his  predictions  correspond  to  predictions  made  by  his  own 
regression  equation,  he  is  considered  linear.  This  method, 
although seemingly sophisticated, does not really take into account
the influences on accuracy and cognition due to relationships
between cues and criterion, between cues themselves, the clinician's
utilization of the cues and limits imposed due to
statistical properties within the system; nor does it allow the
researcher  to  systematically  quantify  both  the  linear  and  non- 
linear components  of  cognitive  functioning.  These  limitations 
were dealt with rather ingeniously by Hursch, Hammond and
Hursch (1964) and Hammond, Hursch and Todd (1964) through the
derivation and application of a statistical formula to analyze
the components of clinical inference by means of multiple
regression analyses utilizing the framework of Brunswik's (195?)
lens  model. 

Tucker (1964) has reformulated the basic "lens model"
equation so as to make it more statistically workable and it is
through  the  utilization  of  this  reformulated  equation  that  the 
cognitive  processes  of  the  clinician  can  be  more  easily  and 
validly  assessed. 

Level  of  Awareness 
There  is  wide  variation  in  opinion  regarding  the  degree 
to which clinicians, and people in general, are able to understand
and verbalize the mediational process of inference
behavior  when  making  judgments.  The  studies  that  are  available 
(Hoffman, 1960; Oskamp, 1962a; Todd, 1954) add credence to the
opinion that the inference process is relatively inaccessible
and that clinicians generally characterize their own inference
process rather poorly. A number of theorists have argued that the


reason for these results is that self-awareness has been viewed
as  a  matter  of  either-or,  rather  than  more-or-less,  and  that  many 
studies  may  be  asking  too  much  of  the  clinician  to  consciously 
make  explicit  all  the  steps  involved  in  arriving  at  a  decision. 
They feel that a more indirect method of measuring self-awareness
would  be  necessary.  Chandler  (1970)  asked  judges,  who  were  engaged 
in  a  clinical  judgment  task,  to  subjectively  weight  the  available 
diagnostic  cues  (predictor  variables)  according  to  the  relative 
saliency  attached  to  each  one.  The  weights  were  then  used  to 
write  a  regression  equation  for  each  judge.  The  equation  was 
applied  and  the  predictions  were  correlated  with  the  judge's 
own  predictions  to  obtain  a  measure  of  the  level  of  awareness. 
Through  use  of  this  more  indirect  method,  a  substantial  degree 
of  awareness  was  found  for  each  judge. 

Confidence and Appropriateness
Although the confidence of the clinician in making decisions
or predictions has never been a topic of intense study, it has
often been included as an incidental variable of interest by a
number of experimenters (Chandler, 1970; Goldberg, 1959;
Oskamp,  1962b).  Most  of  the  evidence  seems  to  indicate  that  con- 
fidence is  an  inverse  function  of  amount  of  experience;  however, 
it  has  been  demonstrated  by  Little  (1961)  that  confidence  is  not 
task  related  and  appears  to  be  more  of  a  stable  personality  trait. 
It  would  therefore  be  much  more  meaningful  to  look  at  confidence 
in relation to accuracy, i.e., how appropriate is the confidence


level used. Adams (1957) defined an equal-interval scale in
terms  of  expected  percentages  of  success  at  various  confidence 
levels  and  derived  a  measure  of  "appropriateness".  Using  this 
measure, Moxley and Satz (1970) and Oskamp (1962b) have looked
at  the  relationship  between  confidence  and  accuracy  and  have 
found  higher  levels  of  appropriateness  for  more  experienced 
judges. 

Experience  and  Expertise 
A  great  deal  of  controversy  has  been  centered  around  the 
variable  of  "experience  level"  in  clinical  prediction  research. 
Much of the representative literature (Goldberg, 1959, 1968;
Grebstein, 1963; Taft, 1955) concludes that predictive accuracy
does  not  increase  significantly  with  amount  of  clinical  experience 
and  that  amount  of  professional  training  and/or  experience  does  not 
relate  to  accuracy.  Other  studies  have,  on  the  other  hand,  reached 
almost opposite conclusions. Moxley and Satz (1970), for instance,
found  a  lower  proportion  of  correct  judgments  for  less  experienced 
judges  under  reduced  levels  of  information.  Oskamp  (1962b)  also 
found  that  accuracy  was  positively  and  significantly  related  to 
experience  level.  Upon  close  inspection  of  most  of  the  research, 
it  would  appear  that  contradictory  results  are  promulgated  by 
the  inappropriate  use  of  "experience  level",  through  the  lack 
of  consideration  of  the  nature  of  the  judgment  task,  the 
clinician's  training  and  the  relationship  between  the  two.  What 
is  called  for  is  a  more  systematic  differentiation  of  judges  on 
the basis of task specific clinical expertise and not simply
years of training or experience. In many studies all professionals,
or all those at the doctoral level or all those with a certain number
of years of experience are termed "experienced" or "experts"
without  any  consideration  of  the  above.  The  present  study 
will  attempt  to  empirically  divide  subjects  on  the  basis  of 
expertise  within  the  fabric  of  the  judgment  task,  as  well  as  on 
the traditional basis of years of clinical experience. Thus
comparisons  of  variables  under  study  can  be  made  at  both  high 
and  low  levels  of  expertise  and  high  and  low  levels  of  experience. 
To  the  author's  knowledge,  this  is  the  first  attempt  to  empirically 
define  a  task  specific  level  of  expertise. 

In  summary,  therefore,  the  reported  research  represents 
a  bold  attempt  to  constructively  deal  with  the  methodological 
criticisms  of  Holt  (1958,  1970)  in  hopes  of  arriving  at  an 
accurate  comparison  of  clinical  and  actuarial  prediction. 
Furthermore,  the  reported  research  is  the  first  to  attempt 
an  explanation  of  the  accuracy  levels  attained  by  focusing 
on  the  cognitive  processes  of  the  clinician  and  thus  bridging 
the  gap  which  previously  existed  between  clinical  accuracy  and 
cognitive  functioning.  Level  of  awareness,  considered  a  dependent 
parameter  of  the  judgment  process,  is  studied  by  means  of  a 
more  indirect  method  of  assessment.  The  independent  parameters 
of  confidence  and  appropriateness  are  also  viewed  within  the 
examination of this study.

Goals of the Study
The goals of the research presented are as follows:

1.  To  carry  out  a  more  accurate  test  of  clinical  versus 
actuarial  prediction. 

2.  To  look  at  the  cognitive  processes  (linear  and  non-linear 
components) of the clinician engaged in a clinical decision
task and to determine the relationship, if any, of cognitive
functioning  to  level  of  accuracy. 

3.  To  empirically  differentiate  between  expert  and  non-expert 
clinicians  on  the  basis  of  the  specific  type  of  judgment 
task  utilized  and  test  for  differences  on  the  variables  of 
accuracy,  cognitive  functioning,  level  of  awareness,  con- 
fidence and  appropriateness. 

4. To differentiate between high experience and low experience
clinicians  and  test  for  differences  on  the  variables  of 
accuracy,  cognitive  functioning,  level  of  awareness,  con- 
fidence and  appropriateness. 


CHAPTER  II 

METHODOLOGY 

Clinical  Task 
Of  primary  consideration  was  the  selection  of  a  task  that  was 
clinically  relevant  and  appropriate  and  on  which  clear-cut 
criterion  information  could  be  obtained.  The  clinical  task  util- 
ized was  the  prediction  of  length  of  stay  in  psychotherapy  from 
MMPI  profiles.  It  was  felt  that  such  a  task  was  clinically  rel- 
evant and  often  encountered  in  situations  demanding  a  selective 
treatment  decision  as  under  conditions  of  limited  treatment 
resources  and  high  patient  demand.  The  use  of  the  MMPI  as  a  predictor 
has  the  advantage  of  having  data  that  can  be  considered  both 
quantitative  and  clinical  in  nature.  Results  of  past  research 
(Mello & Guthrie, 1958) also indicate that a therapist can get
some indication of the course of therapy from an MMPI obtained
on intake.

Clinical  Sample 
A  pool  of  318  MMPI  profiles  was  drawn  from  clients  seen  in 
a  university  mental  health  service  during  the  three  year  period 
of 1964-1967. It was routine practice for all clients being seen
to be given an MMPI and thus there was no bias on this factor.
It was required for the profile to be drawn that the record be
complete,  the  case  be  closed  and  the  criterion  information  be 
available.  Included  within  the  318  cases  were  a  number  of  clients 
who  were  seen  once,  given  another  appointment  and  failed  to 
return  as  well  as  a  few  who  began  treatment  only  to  drop  out 
prematurely, before termination. It was unknown to what extent these
cases  were  discriminatively  influenced,  if  at  all,  by  situational 
or  extrapersonal  factors  or  to  what  extent  they  contaminated 
the criterion. Following Holt's (1970) advocacy on this issue,
these cases were therefore eliminated leaving 222 cases in the
clinical sample. It was also felt that such an elimination procedure
would add to a more homogeneous definition of the criterion,
length of stay in psychotherapy.

Profiles were coded either short term (S) or long term (L)
based  on  the  client's  length  of  stay  in  psychotherapy  at  the 
mental  health  service.  On  the  basis  of  a  bimodal  distribution, 
a short term (S) case was defined as four or fewer therapy sessions
and  a  long  term  (L)  case  as  five  or  more  therapy  sessions.  The 
range  of  length  of  stay  for  the  S  group  was  1  to  4  sessions  with 
a  mean  of  3.12  sessions;  for  the  L  group,  the  range  was 
5 to 30 sessions with a mean of 12.25 sessions. Two-thirds of
the profiles were then randomly chosen as the "standardization"
sample with the remaining one-third to be used as the
"cross-validation" sample. The base rates for S and L cases in each
of the two samples were .52 and .48 respectively. A comparison
between mean MMPI scale scores on the standardization and
cross-validation samples (Appendices A & B) yielded no significant
differences except for the Mf scale with the long term
cross-validation cases showing a significantly (p<.05) higher mean
than the long term standardization cases.
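
The random two-thirds and one-third division described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the draw is taken to be a simple, unstratified random split, and the names `split_sample` and `base_rates`, as well as the fixed seed, are hypothetical and not part of the original procedure.

```python
import random


def split_sample(case_ids, fraction=2 / 3, seed=0):
    """Randomly divide cases into standardization and cross-validation sets.

    Assumption: a simple random draw, not stratified by criterion group.
    """
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    n_standardization = round(len(shuffled) * fraction)
    return shuffled[:n_standardization], shuffled[n_standardization:]


def base_rates(labels):
    """Proportion of short term (S) and long term (L) cases in a sample."""
    n = len(labels)
    return labels.count("S") / n, labels.count("L") / n


# 222 retained cases, identified here simply by index.
standardization, cross_validation = split_sample(range(222))
print(len(standardization), len(cross_validation))
```

With 222 cases the split yields 148 standardization and 74 cross-validation cases, matching the group sizes reported in the Procedure section.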

Judges 
The  judges  consisted  of  14  clinical  psychologists,  6  of  whom 
were at the Ph.D. level and 8 at the M.A. or M.S. level. All judges
were  affiliated  with  Harvard  Medical  School  and  were  either  on 
the  clinical  psychology  staff  or  were  clinical  psychology  interns 
at  a  Harvard  associated  teaching  hospital.  For  the  purposes  of 
this  study,  "expertise"  was  defined  empirically  by  one's  performance 
on  a  post  hoc  test  measuring  diagnostic  skill  with  the  MMPI. 
The  test  derived  consisted  of  matching  mean  MMPI  profiles  of  14 
Marks and Seeman (1963) code types with 22 possible diagnoses
assigned  to  these  code  types  by  this  Atlas.  Judges  were  asked  to 
assign  one  diagnosis  for  each  of  the  14  profiles  and  for  each 
profile  received  a  score  of  3,  2,  1  or  0  depending  on  whether 
the  diagnosis  assigned  was  the  first,  second  or  third  in 
frequency  associated  with  the  code  type  or  an  incorrect 
diagnosis (Appendix C). Scores for each judge could therefore
range from a minimum of 0 to a maximum of 42. For the judges
utilized  in  this  study,  scores  ranged  from  11  to  39  with  a 
mean  of  23.36.  Seven  judges  scored  above  the  overall  mean  and 
were  defined  as  "experts"  for  the  purpose  of  this  study;  the 
remaining seven judges scored below the overall mean and were
defined  as  "non-experts".  The  range  and  mean  scores  on  this 
task  for  expert  and  non-expert  judges  are  given  in  Table  1. 
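
The scoring rule for the expertise test and the division of judges at the overall mean can be sketched as follows. This is a minimal illustration: the function names and the toy answer key are hypothetical, and treating a score exactly equal to the mean as "non-expert" is an assumption, since the text only speaks of scores above and below the mean.

```python
def score_profile(assigned, ranked_diagnoses):
    """Score one profile: 3, 2 or 1 points when the assigned diagnosis is the
    first, second or third most frequent for the code type, otherwise 0."""
    points = {0: 3, 1: 2, 2: 1}
    for rank, diagnosis in enumerate(ranked_diagnoses[:3]):
        if assigned == diagnosis:
            return points[rank]
    return 0


def expertise_score(answers, key):
    """Total score over all profiles (0 to 42 when there are 14 profiles)."""
    return sum(score_profile(a, k) for a, k in zip(answers, key))


def split_by_mean(scores):
    """Divide judges into experts (above the overall mean) and non-experts.

    Assumption: a score equal to the mean counts as non-expert.
    """
    mean = sum(scores.values()) / len(scores)
    experts = sorted(j for j, s in scores.items() if s > mean)
    non_experts = sorted(j for j, s in scores.items() if s <= mean)
    return experts, non_experts


# Hypothetical two-profile key: diagnoses ranked by frequency per code type.
toy_key = [["a", "b", "c"], ["d", "e", "f"]]
toy_scores = {
    "J1": expertise_score(["a", "f"], toy_key),
    "J2": expertise_score(["x", "e"], toy_key),
}
print(split_by_mean(toy_scores))
```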

Judges  were  also  divided  by  the  traditional  manner  of  years 
of  clinical  experience.  Seven  judges  with  the  greatest  number  of 
years  of  clinical  experience  were  termed  "high  experience"  judges 
while  the  remaining  seven  judges  were  termed  "low  experience" 
judges.  Table  2  reports  the  range  and  mean  number  of  years  of 
clinical  experience  for  each  of  these  two  groups. 

TABLE 1

RANGE AND MEAN SCORES ON A TEST OF MMPI EXPERTISE
FOR EXPERT AND NON-EXPERT JUDGES

                                   Range    Mean

Expert Clinical Judges             24-39    30.15
Non-Expert Clinical Judges         11-20    16.57


TABLE 2

RANGE AND MEAN NUMBER OF YEARS OF CLINICAL EXPERIENCE
FOR HIGH AND LOW EXPERIENCE LEVEL JUDGES

                                   Range    Mean

High Experience Clinical Judges     6-16    12.50
Low Experience Clinical Judges       2-4     2.83

The Mathematics of Cognitive Functioning
Tucker's (1964) reformulation of the "lens model" equation
of Hammond, Hursch and Todd (1964) was used to assess the linear
and non-linear components of clinical inference. The reformulated
equation is as follows:

    $$ r_a = G R_e R_s + C \sqrt{1 - R_e^2} \, \sqrt{1 - R_s^2} $$

where

$r_a$ = the validity coefficient of the judge: the correlation
    between the actual criterion values and the judge's
    predictions.
$G$ = the linear component of judgmental accuracy: the
    correlation between predicted scores from the linear
    model of the criterion and the linear model of the judge.
$R_e$ = the linear predictability of the criterion: the multiple
    correlation between cues and criterion values.
$R_s$ = the linear predictability of the judge: the multiple
    correlation between cues and judge's predictions.
$C$ = the non-linear component of judgmental accuracy: the
    correlation between the variance in the criterion
    system and the variance in the judgmental system which
    is unaccounted for by the linear component.

In the notation used, the subscript e refers to the environment
or criterion, and s to the subject or judge.
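
The components of the lens model equation can be computed directly from cue values, criterion values and a judge's predictions. The sketch below is an illustration only, assuming ordinary least-squares linear models (with intercept) for both the criterion and the judge; the function names and the toy data are hypothetical. The judge's validity coefficient is written `r_a` here. Under these assumptions the identity between the validity coefficient and its linear and non-linear parts holds exactly, which the last lines check numerically.

```python
import math


def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ols_fit(cues, target):
    """Fitted values from a least-squares regression of target on the cues."""
    rows = [[1.0] + list(c) for c in cues]  # leading 1.0 is the intercept
    k = len(rows[0])
    # Augmented normal equations (X'X | X'y).
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, target))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return [sum(b * x for b, x in zip(beta, r)) for r in rows]


def lens_model(cues, criterion, judgments):
    """Return r_a, G, R_e, R_s and C for one judge."""
    fit_e = ols_fit(cues, criterion)   # linear model of the criterion
    fit_s = ols_fit(cues, judgments)   # linear model of the judge
    res_e = [y - f for y, f in zip(criterion, fit_e)]
    res_s = [y - f for y, f in zip(judgments, fit_s)]
    return {
        "r_a": pearson(criterion, judgments),
        "G": pearson(fit_e, fit_s),
        "R_e": pearson(criterion, fit_e),
        "R_s": pearson(judgments, fit_s),
        "C": pearson(res_e, res_s),
    }


# Hypothetical toy data: two cues, eight cases, dichotomous outcomes.
toy_cues = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5), (7, 8), (8, 7)]
toy_criterion = [0, 0, 0, 1, 0, 1, 1, 1]
toy_judgments = [0, 0, 1, 0, 1, 1, 1, 1]

m = lens_model(toy_cues, toy_criterion, toy_judgments)
rebuilt = (m["G"] * m["R_e"] * m["R_s"]
           + m["C"] * math.sqrt(1 - m["R_e"] ** 2) * math.sqrt(1 - m["R_s"] ** 2))
print(abs(m["r_a"] - rebuilt) < 1e-9)
```

The first term carries the accuracy attributable to linear use of the cues; the second carries whatever agreement remains between the residuals of the two linear models.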

The Assessment of Self-Awareness
The specific procedure employed to measure the level of
awareness of clinicians was that employed by Chandler (1970).
It requires that each judge express the relative importance he
attached  to  each  of  the  available  diagnostic  cues  (predictor 
variables)  by  distributing  100  points  among  them  in  a  way  which 
he  felt  reflected  the  contribution  each  cue  made  to  his  decisions. 
These  subjective  weights  then  formed  the  basis  of  a  regression 
equation  written  for  each  judge.  The  values  of  the  predictor 
variables  used  were  then  entered  into  this  regression  equation 
and  the  resulting  predictions  were  correlated  with  the  judge's 
own  predictions.  This  correlation  was  taken  as  a  measure  of  the 
judge's  level  of  awareness. 
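
The awareness measure can be sketched as a correlation between a judge's actual predictions and the predictions implied by his subjective weights. The code below is a rough illustration, not Chandler's exact computation: it assumes that the 100 points are converted to proportions and applied as positive weights to standardized cue values, and all function names and data are hypothetical.

```python
import math


def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def standardize(values):
    """Convert raw cue values to z-scores (assumes the cue is not constant)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def weighted_composite(profiles, points):
    """Predictions implied by the judge's subjective weights.

    Assumption: points are rescaled to proportions and used as positive
    weights on standardized cues.
    """
    if sum(points) != 100:
        raise ValueError("subjective weights must total 100 points")
    columns = [standardize(col) for col in zip(*profiles)]
    weights = [p / 100 for p in points]
    return [sum(w * col[i] for w, col in zip(weights, columns))
            for i in range(len(profiles))]


def awareness_index(profiles, points, judge_predictions):
    """Correlation between weight-implied and actual predictions."""
    return pearson(weighted_composite(profiles, points), judge_predictions)


# Hypothetical data: four profiles on three cues, S coded 0 and L coded 1.
toy_profiles = [(50, 60, 70), (55, 65, 40), (70, 45, 60), (65, 80, 55)]
toy_points = [50, 30, 20]
print(awareness_index(toy_profiles, toy_points, [0, 1, 1, 0]))
```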

Procedure 
All judges were examined on an individual basis. The purpose
of  the  research  was  described  in  full  to  each  of  the  judges  and 
information  was  given  about  the  patient  population,  case  elim- 
inations, the  thirteen  MMPI  scales  and  the  criterion  measure  to 
be  predicted.  After  this,  the  standardization  sample  of  MMPI 
profiles  was  presented  to  each  judge  in  two  groups  with  the 
criterion  information  given  for  each  group.  One  group  consisted 
of  79  S  therapy  cases  while  the  other  consisted  of  69  L  therapy 
cases.  This  was  the  only  information  given  the  judge  concerning 
the  individual  cases.  Each  judge  was  given  a  maximum  of  one  hour 
to  look  over  the  profiles  of  each  of  the  criterion  groups  of 
the  standardization  sample  in  order  to  gain  knowledge,  later  to 
be tested on a new sample of cases, concerning the differentiation
of S and L therapy patients on the basis of an MMPI profile. The
standardization sample of cases was also analyzed actuarially
or  statistically  by  means  of  a  discriminant  function  analysis 
with  the  three  validity  and  ten  clinical  scales  of  the  MMPI  as 
predictor  variables  and  the  dichotomous  variable  of  S  or  L  therapy 
stay  as  the  criterion.  Thus  case  information  given  to  both  the 
clinician  and  the  actuary  was  equated,  A  discriminant  function 
equation  was  computed. 

To obtain the actuarial predictions for comparison with the
clinical predictions, the values of the thirteen MMPI scales for
each case in the cross-validation sample were entered into the
computed discriminant function equation, composite discriminant
scores were obtained, an optimal cutoff score was determined and
predictions were made. The cross-validation sample included 74 MMPI
profiles, of which 39 were S therapy cases and 35 were L therapy
cases, randomly distributed throughout the sample. The clinical
predictions were obtained by having each judge make his judgment
concerning S or L therapy stay on the 74 cross-validation cases
immediately following familiarization with the standardization
sample. In addition, judges were asked to indicate the level of
confidence they had for each decision in terms of the probability
that the decision was correct. This was done on an 11-point scale
with confidence estimates ranging from 50 to 100 in intervals of 5.

Following the judgments for the cross-validation cases, each
judge was asked to specify the way in which he made use of the
predictor variables in arriving at his decisions by distributing
100 points among the thirteen MMPI scales in a manner which
indicated the relative importance attached to each predictor variable.

The test for expertise, measuring diagnostic skill with
the MMPI, was administered approximately one week after
administration of the above clinical task. At that time, judges were
also asked to indicate the number of years of clinical experience
(both direct and indirect patient contact) they had had, with
graduate school practicums and the clinical psychology
internship counting as one year each.


CHAPTER  III 


RESULTS 


Focus  on  Accuracy 
The  Actuary 

A discriminant function analysis for the standardization
sample was computed on an IBM 360 Computer with the use of the
UCLA Biomedical Computer Program BMD04M. The following lambda
values for each variable were obtained: L ($\lambda_{1}$ = 0.00017),
F ($\lambda_{2}$ = 0.00022), K ($\lambda_{3}$ = -0.000M), Hs ($\lambda_{4}$ = 0.00002),
D ($\lambda_{5}$ = -0.00023), Hy ($\lambda_{6}$ = 0.00022), Pd ($\lambda_{7}$ = 0.00001),
Mf ($\lambda_{8}$ = -0.00009), Pa ($\lambda_{9}$ = -0.00006), Pt ($\lambda_{10}$ = -0.00016),
Sc ($\lambda_{11}$ = 0.00027), Ma ($\lambda_{12}$ = -0.00031) and Si ($\lambda_{13}$ = -0.00006).
Mean composite discriminant scores (Z scores) for each criterion
group were calculated, where the S group was defined as Population A
and the L group as Population B. The cutoff score for each case
was determined thus:

If $Z_i \geq (\bar{Z}_A + \bar{Z}_B) / 2$

then predict short term group (Population A).

If $Z_i < (\bar{Z}_A + \bar{Z}_B) / 2$

then predict long term group (Population B).
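
The cutoff rule just stated can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the composite scores are hypothetical and are not taken from the study, and the sketch assumes, as the rule itself does, that Population A (short term) has the higher mean composite score.

```python
from statistics import mean


def midpoint_cutoff(scores_a, scores_b):
    """Cutoff halfway between the mean composite scores of the two groups."""
    return (mean(scores_a) + mean(scores_b)) / 2


def classify(z, cutoff):
    """Predict 'S' (Population A) at or above the cutoff, otherwise 'L' (Population B)."""
    return "S" if z >= cutoff else "L"


# Hypothetical composite discriminant scores (illustrative values only).
short_stay = [0.012, 0.018, 0.015]   # Population A
long_stay = [0.004, 0.009, 0.008]    # Population B

cutoff = midpoint_cutoff(short_stay, long_stay)
predictions = [classify(z, cutoff) for z in (0.016, 0.006, cutoff)]
```

A case falling exactly on the cutoff is assigned to the short term group, following the "greater than or equal to" form of the first condition.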
An analysis of variance of the discriminant function showed
significant differentiation between criterion groups on the
composite Z scores (F = 2.657, p < .01). By comparison of
predictions to the actual criterion information, the hit rate
for those cases on which the discriminant function was derived
was 65 percent. To determine the efficiency of the computed
discriminant function equation on cross-validation, the values
of the predictor variables were entered into the equation for
the cross-validation sample. Mean composite discriminant scores
and cutoff points for this new sample were calculated as above.
The hit rate on the cross-validation sample was 54 percent.
Tables 3 and 4 give a breakdown of the classification by the
discriminant function equation for the standardization and
cross-validation samples.

TABLE 3

PREDICTIVE CLASSIFICATION BY USE OF THE DISCRIMINANT
FUNCTION (DF) FOR THE STANDARDIZATION SAMPLE


Classification      Criterion Classification     Total N
by DF                    S           L           Classified by DF

S                        53          26               79
L                        26          43               69

Total in Class           79          69              148

$\chi^2$ = 12.82, df = 1, p < .001

TABLE 4

PREDICTIVE CLASSIFICATION BY USE OF THE DISCRIMINANT
FUNCTION (DF) FOR THE CROSS-VALIDATION SAMPLE


Classification      Criterion Classification     Total N
by DF                    S           L           Classified by DF

S                        24          19               43
L                        15          16               31

Total in Class           39          35               74

$\chi^2$ = 0.40, df = 1, n.s.
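
The hit rates and chi-square values reported for Tables 3 and 4 can be recomputed from the cell frequencies. The Python sketch below is offered as a check under two assumptions: that the chi-square is the ordinary Pearson statistic without continuity correction, and that the cells of Table 4 are arranged as rows for classification by DF (S, L) and columns for criterion (S, L), which is the reading consistent with the marginal totals. The function names are illustrative.

```python
def hit_rate(table):
    """Proportion of cases on the main diagonal (correct classifications)."""
    total = sum(sum(row) for row in table)
    hits = sum(table[i][i] for i in range(len(table)))
    return hits / total


def pearson_chi_square(table):
    """Pearson chi-square for an r x c frequency table, no continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Rows: classification by DF (S, L); columns: criterion (S, L).
table_3 = [[53, 26], [26, 43]]   # standardization sample, N = 148
table_4 = [[24, 19], [15, 16]]   # cross-validation sample, N = 74

rates = (hit_rate(table_3), hit_rate(table_4))
chi_squares = (pearson_chi_square(table_3), pearson_chi_square(table_4))
```

Under these assumptions the hit rates round to 65 and 54 percent, and the chi-square values come to approximately 12.8 and 0.40, in line with the figures given in the tables.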

The  Clinician 

All clinical judges except one used the full hour allotted
to gain familiarization with the MMPI profiles in the
standardization sample. The one exception decided to go on to the
judgment task after only twenty minutes. The hit rate for the
judges' predictions on the cross-validation sample ranged from
46 to 71.6 percent with an overall mean of 61.25 percent and
13 out of 14 judges exceeding the 54 percent hit rate obtained
by the actuarial method.

Process  of  Clinical  Judgment 
To explore the cognitive functioning of the clinician,
the values of $r_a$, G, $R_e$, $R_s$ and C were computed for each
judge (Appendix D), resulting in a separate analysis for each
clinician. The validity
coefficient ($r_a$) was obtained by means of a tetrachoric
correlation (Guilford, 1965) between the judge's predictions
and the actual criterion values of the cross-validation sample.
To obtain the linear component of judgmental accuracy (G),
discriminant function analyses (BMD04M) were computed on the
cross-validation sample using the actual criterion values as
the criterion and again using each judge's predictions as the
criterion. A tetrachoric correlation between the predicted scores
from each of these analyses yielded the value of G. A multiple
biserial correlation (Wherry, 1947) between the cues and the
criterion values of the cross-validation sample and between the
cues and the judge's predictions measured the linear predictability
of the criterion ($R_e$) and the linear predictability of each
judge ($R_s$) respectively. The non-linear component of
judgmental accuracy (C) was calculated as follows:

$$C = \frac{r_a - G R_e R_s}{\sqrt{1 - R_e^{2}}\,\sqrt{1 - R_s^{2}}}$$

A Pearson product-moment correlation between hit rate or accuracy
level and the amount of variance accounted for by the non-linear
component yielded a correlation coefficient of .68 (p < .01). The
correlation between accuracy level and the amount of variance
accounted for by the linear component (G) failed to reach
significance. Table 5 gives a summary of the evaluation of the
mathematics of cognitive functioning for the most accurate,
typical (mean) and least accurate judges.
TABLE 5

MATHEMATICS OF COGNITIVE FUNCTIONING FOR THE MOST
ACCURATE, TYPICAL (MEAN) AND LEAST ACCURATE JUDGES^a


Judge                 $r_a$      G       $R_s$      C

Most Accurate          .629     .829     .6734     .5322
Typical (Mean)^b       .355     .618     .7382     .2007
Least Accurate        -.139    -.454     .5653    -.0049

^a $R_e$ was not included in this table for it had a constant value
throughout of .5283.
^b Fisher's Z transformation was used to obtain means.
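
As a check on the formula for C, the following minimal Python sketch recomputes the non-linear component for the least accurate judge from the tabled values ($r_a$ = -.139, G = -.454, $R_s$ = .5653) and the constant $R_e$ = .5283. The function name is illustrative and is not part of the original analysis.

```python
from math import sqrt


def nonlinear_component(r_a, G, R_e, R_s):
    """C = (r_a - G*R_e*R_s) / (sqrt(1 - R_e^2) * sqrt(1 - R_s^2))."""
    return (r_a - G * R_e * R_s) / (sqrt(1 - R_e ** 2) * sqrt(1 - R_s ** 2))


R_E = 0.5283  # linear predictability of the criterion, constant for all judges

# Values for the least accurate judge as given in Table 5.
c_least_accurate = nonlinear_component(r_a=-0.139, G=-0.454, R_e=R_E, R_s=0.5653)
```

The result rounds to -.0049, which agrees with the tabled value of C for this judge.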


The  Index  of  Awareness 
Using the subjective weights provided for the predictor
variables (MMPI scale scores), a linear regression equation
was written for each judge. The values of the predictor
variables for the cross-validation sample were entered into this
equation, cutoff scores calculated (as above) and predictions
made. An "index of awareness" was provided by the correlation
of the predictions made by the equation with the actual
predictions made by the judge. The higher this index, the higher
the concordance between the judge's description of his judgmental
process and the actual inference behavior. Measures for level of
awareness ranged from .37 to .86 with an overall mean (Z
transformation) of .68. A product-moment correlation between hit rate
and level of awareness yielded a coefficient of -.53 (n.s.).
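
The steps for the index of awareness can be sketched as follows. This is a hedged illustration rather than the procedure actually run: the text does not specify the type of correlation, so a product-moment correlation on predictions coded 0 and 1 is assumed; the cutoff is supplied by the caller; and the weights, profiles and judge's calls are hypothetical, with only three scales shown for brevity.

```python
from math import sqrt


def composite_scores(weights, profiles):
    """Weighted sum of scale scores per case, using the judge's 100-point weights."""
    return [sum(w * x for w, x in zip(weights, p)) / 100 for p in profiles]


def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def index_of_awareness(weights, profiles, judge_calls, cutoff):
    """Correlate the equation's dichotomized predictions with the judge's own calls."""
    equation_calls = [1 if s >= cutoff else 0 for s in composite_scores(weights, profiles)]
    return pearson(equation_calls, judge_calls)


# Hypothetical data: three scales, four cases (not from the study).
weights = [50, 30, 20]                      # points distributed by the judge
profiles = [[70, 60, 50], [50, 55, 45], [80, 65, 70], [45, 50, 40]]
judge_calls = [1, 0, 1, 1]                  # judge's actual predictions, coded 0/1
index = index_of_awareness(weights, profiles, judge_calls, cutoff=60)
```

An index near 1 would indicate that the weights the judge reports reproduce his actual predictions closely; a low index indicates that the reported weights describe his behavior poorly.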
Other Findings
Confidence and Appropriateness

The average level of confidence, the judge's expected
percentage of correct decisions, for all judges was 67.4 with the mean
confidence for each judge ranging from a low of 57.4 to a high of
78.y. An appropriateness score was obtained for each judge by
weighting the absolute deviation of the judge's accuracy score,
at each confidence level, from the expected percentage of correct
decisions by the number of judgments made at that confidence
level, summing across all confidence levels and dividing by
the total number of judgments made. Thus if a judge were
perfectly appropriate, his accuracy at each confidence level would
match the expected percentage of hits and the absolute deviations
would be zero, resulting in an appropriateness score of zero. The
higher the score, the less appropriate the judge. Scores ranged
from 7.03 to 19.28 with an overall mean of 13.33.
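
The appropriateness score described above is a weighted mean absolute deviation, and can be sketched in Python as follows. The function name and the sample judgments are hypothetical; confidence is expressed as a percentage, as in the 50 to 100 scale used by the judges.

```python
from collections import defaultdict


def appropriateness(judgments):
    """Weighted mean |accuracy - confidence| across confidence levels.

    judgments: iterable of (confidence_percent, correct) pairs.
    A score of zero means accuracy matched confidence at every level.
    """
    by_level = defaultdict(list)
    for confidence, correct in judgments:
        by_level[confidence].append(1 if correct else 0)

    total = sum(len(outcomes) for outcomes in by_level.values())
    weighted_sum = 0.0
    for confidence, outcomes in by_level.items():
        accuracy = 100 * sum(outcomes) / len(outcomes)
        # Weight each level's deviation by the number of judgments made at it.
        weighted_sum += abs(accuracy - confidence) * len(outcomes)
    return weighted_sum / total


# Hypothetical judge: four judgments at 50 percent (two correct)
# and four at 75 percent (two correct).
judgments = [(50, True), (50, False), (50, True), (50, False),
             (75, True), (75, True), (75, False), (75, False)]
score = appropriateness(judgments)
```

In this hypothetical case the judge is perfectly appropriate at the 50 percent level and overconfident by 25 points at the 75 percent level, giving a score of 12.5.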
Experience  and  Expertise 

Comparisons were made between expert and non-expert judges
and between high experience and low experience judges on the
following variables: hit rate or accuracy level, the non-linear
component of judgmental accuracy (C), the linear component of
judgmental accuracy (G), level of awareness (LA), confidence
and appropriateness. A Fisher's Z transformation was used on the
values of C, G and LA in order to attain a normal distribution
for each variable. Results indicate a significant difference in
accuracy level, non-linear cognitive functioning (C) and level
of awareness when differentiation of judges is on the basis of
expertise but not when on the basis of experience. Expert judges
are more accurate, are more non-linear in their functioning and
have a lower level of awareness of their own inference behavior
than non-expert judges. Although there were no significant
differences on any of these variables when differentiation was made by
years of experience, there was a tendency for low experience
judges to be more accurate, more non-linear and more confident
of their decisions than high experience judges. Results of the
analyses are given in Tables 6 and 7.
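
For readers unfamiliar with the Fisher's Z transformation applied to C, G and LA, a brief Python sketch follows. It shows the transformation itself and one common use, averaging correlations in the transformed metric and converting back; whether the tabled means were converted back is not stated in the text, so that step is an assumption. The function names are illustrative.

```python
from math import atanh, tanh


def fisher_z(r):
    """Fisher's Z transformation of a correlation coefficient."""
    return atanh(r)


def mean_correlation(rs):
    """Average correlations in the Z metric, then transform back to r."""
    mean_z = sum(fisher_z(r) for r in rs) / len(rs)
    return tanh(mean_z)


# Illustration using the two extremes of the awareness indices reported above.
example = mean_correlation([0.37, 0.86])
```

Because the transformation stretches values near 1, a mean obtained this way lies above the simple arithmetic mean of the same coefficients.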

TABLE 6

COMPARISONS OF EXPERT AND NON-EXPERT JUDGES
ON THE VARIABLES OF INTEREST


                               Expert    Non-Expert
Variable                         M           M        SEdiff       t

Accuracy                        .643        .582       .027       2.28
Non-Linear Component (C)^a      .340        .067       .088       3.09**
Linear Component (G)^a          .804        .640       .209       0.78
Level of Awareness (LA)         .676        .994       .122      -2.61*
Confidence                    67.910      66.900      3.380       0.30
Appropriateness               13.619      13.050      2.320       0.30

^a transformed scores.
* p < .05, for a two tailed test.
** p < .01, for a two tailed test.

TABLE 7

COMPARISON OF HIGH AND LOW EXPERIENCE LEVEL
JUDGES ON THE VARIABLES OF INTEREST


                               High         Low
                               Experience   Experience
Variable                         M            M        SEdiff       t

Accuracy                        .593        .632       .030      -1.34
Non-Linear Component (C)^a      .115        .292       .107      -1.65
Linear Component (G)^a          .657        .787       .211      -0.62
Level of Awareness              .895        .775       .149       0.80
Confidence                    64.660      70.150      3.000      -1.83
Appropriateness               12.177      14.491      2.230      -1.04

^a transformed scores.

CHAPTER  IV 


DISCUSSION 


The  Clinician  Versus  the  Actuary 
One of the most important findings of the research reported
here was that, in a clinically relevant judgment task, 13 out
of 14 clinical judges surpassed an actuarial (statistical) method
of prediction. This is the only finding to date in a clinical
judgment study that is so overwhelmingly in favor of the
clinician. One of the major purposes of this research as it was
conceived was to answer Holt's (1970) call for a "counterattack"
against the many existing inadequate clinical judgment studies
by "cleaning up" the methodology in order to arrive at a more
accurate test of clinical versus actuarial prediction. To this end,
the research reported called for simultaneous attempts to
predict the same criterion from the same data by clinicians and
statisticians, given the same initial information and substantially
going through the same preliminary steps. This is the first study
to incorporate such an "optimal" design as outlined by Holt
as well as the first study to utilize the very essential
requirement of cross-validation of both the clinical and actuarial
prediction systems. The remaining criticisms of Holt concerning
inadequate criterion measures, misrepresentation of clinicians,
inadequate sample size and unsuitable clinical data were also
considered and dealt with accordingly. During the course of
the research, however, it became evident that certain particulars
were acting either against the clinician or the actuary and it
is felt that, although they in no way invalidate the results of
this research, they demand further discussion.

It was found in preparation for the reported research that
pilot judges who made predictions on a similar sample of cases,
without recourse to familiarization with the standardization
sample, were able to predict length of therapy stay with much
higher accuracy than the present judges. For 5 pilot judges,
hit rates ranged from 59 to 84 percent. This fact, combined
with the verbal reports of the judges in the present study on
how difficult it was to integrate the information available
in the standardization sample, raises the question of how
much systematic learning can really take place. No doubt each clinician
comes to the judgment task with a certain amount of knowledge
and/or preconceptions concerning the relationship of cues to
criterion, whether valid or not. Exposure to the standardization
sample may very well be disruptive to the "set" brought into
the situation, cause an overload on the system or lower the
judge's signal-to-noise ratio and thus negatively affect accuracy.
This lack of disruptive input data in the pilot study group could
account for the higher level of accuracy in this group. The fact
that the most accurate judge in the present study spent the least
amount of time with the standardization sample (twenty minutes!)
would seem to give added evidence for this hypothesis.

Although the same initial information (MMPI scale scores
and criterion values for the standardization sample) was given
to both the clinician and the actuary, how this information was
incorporated and used would differ by virtue of the
intrinsic differences between them. While the clinician may look
over every profile and try to come up with some set of rules
relating cues to criterion, the actuary or computer computes a
mean profile for each criterion group and, in essence, compares
each case to the mean or composite profiles. To the degree that
the mean profiles are distinct from each other, they may represent
a more powerful input and a less disruptive source than that
which the clinicians were forced to utilize. Giving the
clinician the mean profiles for each criterion group and allowing
him to study these may make the input data less disruptive and
equate him more with regard to the actuary, leading to an even
fairer test of the two.

A limitation inherent in the statistical prediction system
is the "shrinkage" of the validity of the prediction equation
when applied to a sample other than the sample on which it was
derived. The derivation of the prediction equation on the
standardization sample capitalizes on every single variation in
that particular sample to get the maximum differentiation
possible between criterion groups. To the extent that the
cross-validation sample even minimally differs from the standardization
sample, the derived prediction equation is open to possible
shrinkage of validity. This would account for the actuarial
drop in hit rate from 65 to 54 percent in going from the
standardization to cross-validation cases. The extent to which shrinkage
affects the clinicians has not been determined but if one
logically assumes that shrinkage is present for the clinical
judges, it would appear that they are not so much at the mercy
of this statistical fact as their actuarial counterpart.

The statistical prediction system was also faced with another
limitation in that the equation used for prediction was derived
from a linear regression analysis and therefore placed the
restriction of linearity upon this system. The question remains
whether the statistical predictions would improve by means of
a non-linear prediction equation and whether this improvement
would surpass the clinician. The ultimate test of clinical
versus actuarial prediction would have to incorporate a
comparison between the clinician and a non-linear prediction
equation. Such a study is now being planned as a follow-up
to the present research.

Cognitive Processes
As mentioned previously, past studies on clinical judgment
have focused either on the accuracy of clinical versus actuarial
predictions or on the cognitive functioning of the clinician
and have treated these topics independently of each other. The
present research is the first to study both accuracy and cognition
and attempt an explanation of the accuracy of the clinician by means
of cognitive functioning.
The significant relationship between the non-linear
component of judgmental accuracy (C) and the hit rate suggests that
those judges who were most accurate were also the most non-linear
in their approach to the task. It can be seen by looking at raw
score values that most judges, whether accurate or not, had
relatively high values for the linear indices of $R_s$ and G.
Hammond and Summers (1972) have termed these indices "cognitive
control" and "linear knowledge" respectively and see them as
acting independently in cognitive tasks. G denotes the degree
to which the judge's cognitive system can mirror the linear
task system while $R_s$ measures the extent to which this
knowledge can be executed. They also point out that the linear
predictability of the criterion ($R_e$) sets a limit in the linear
system on the extent to which achievement ($r_a$) may occur and
that for a judge to increase his accuracy beyond this
statistical limit, he must be able to detect and correctly utilize
any non-linearity in the judgment task. This is exemplified
in the present study, where $R_e$ = .5283, by looking at the values
of these cognitive measures for the most accurate and least
accurate judges (see Table 5). The most accurate judge is not
only able to mirror the linear task system to a substantial degree
and thus approach the limit of accuracy set by this system but
also to go beyond this limit through utilization of a significant
amount of non-linearity. The least accurate judge, on the other
hand, is not even able to mirror the linear system nor to approach
its limit of accuracy.
Relatively high values of G and $R_s$ therefore allow judges
to match the corresponding accuracy of the linear system up to
the limits set by $R_e$; however, to go beyond this, non-linear
components must be utilized. This would account for the
non-significant relationship between G and accuracy level as well
as support the necessity for a study comparing clinical and
non-linear statistical predictions as mentioned above.
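
The limit set by the linear system can be made explicit by rearranging the formula for C into the form usually called the lens model equation. The lines below are a restatement under the definitions used in this study, not an additional result:

```latex
\begin{align*}
r_a &= G\,R_e\,R_s + C\,\sqrt{1 - R_e^{2}}\,\sqrt{1 - R_s^{2}} \\
C = 0 \;\Rightarrow\; r_a &= G\,R_e\,R_s \le R_e = .5283
\quad \text{since } |G| \le 1 \text{ and } 0 \le R_s \le 1 .
\end{align*}
```

The most accurate judge's $r_a$ of .629 exceeds this bound, which is possible only if the second term, and therefore C, is positive.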

The definition of the term C offers a rather unusual
problem that should be mentioned. Throughout this study, C has
been referred to as the non-linear component of judgmental
accuracy following the definitions of Hammond, Hursch and Todd
(1964) and Tucker (1964). Implied in this definition is that
it is a measure ($C^2$) of the amount of variance accounted for
by non-linearity; however, close inspection of the mathematical
formulas reveals that it is rather a measure of the amount of
variance not accounted for by linearity. Thus non-linear,
configural and error variance are lumped together. Hammond,
Hursch and Todd (1964) contend that C can become large if and
only if there is some systematic non-linear variance in the
system but this would still leave the task of determining
how much is non-linear (for instance as presented by a U shaped
curve) and how much is configural or even more complex. The
answer to this question most likely lies in further sophistication
of the formulas, which is beyond the scope of this research.
Awareness of the Clinician
Chandler (1970) found much higher indices of awareness for
clinical judges than were found in the present study. This may
well be due to the lower levels of awareness observed for the
expert judges as compared to the non-expert judges. It seems
that these results can be explained by reference to the findings
that expert judges are significantly more accurate and more
non-linear than non-expert judges. As pointed out above, to increase
accuracy in the judgment task, the judge must draw on and utilize
non-linear cues, which accounts for the positive and significant
relationship between accuracy and non-linearity. The method used
to measure level of awareness forces the judge to describe his
judgmental processes in strictly linear terms, which would be
more difficult the more non-linear his approach. This would
also account for a few of the judges' verbal reports stating
that this task was "impossible to do," "the most difficult
part of all" and that it "doesn't do justice to the way I used
the scale scores." The negative, although non-significant,
relationship between level of awareness and accuracy can
also be explained in this manner. The low levels of awareness
attained therefore appear to be due more to methodological
limitations than anything else.

Other  Variables 
Confidence  and  Appropriateness 

The variables of confidence and appropriateness were very
disappointing in their results, for no significant differences
were found between expert and non-expert nor between high and
low experience judges. The reason for this may be determined to
some degree by the fact that appropriateness scores seem to be
somewhat higher overall, indicating lower appropriateness,
for judges in this study than for comparable groups in other
research (Oskamp, 1962b; Moxley & Satz, 1970). This means that
the judges in this study were more confident than their accuracy
would allow and this may be an artifact of the population from
which the judges were drawn. This would also add confirmation
to Little's (1961) argument that confidence is really a stable
personality trait.
Experience  and  Expertise 

The  task  to  empirically  determine  clinical  expertise  was 
devised  to  take  into  consideration  the  nature  of  the  judgment 
task,  the  judge's  clinical  training  and  the  relationship  between 
the  two.  Although  criticisms  might  be  raised  concerning  the 
diagnostic  categories  used  or  the  actual  skills  measured,  it 
is  felt  that  such  an  empirical  differentiation  of  judges  is 
much  more  meaningful  and  appropriate  than  simply  a  division 
on  the  basis  of  years  of  clinical  experience.  Significant 
differences between expert and non-expert judges on accuracy,
non-linearity (C) and level of awareness would have gone
undetected had judges been divided solely by the traditional
manner of years of experience.
APPENDIX A


TABLE 8


MEAN SCALE SCORES ON THE MMPI FOR SHORT
TERM THERAPY CASES: STANDARDIZATION
AND CROSS-VALIDATION SAMPLES


           Standardization Sample     Cross-Validation Sample
                    n=79                       n=39
Scale          M          SD             M          SD

L            48.19      10.21          48.82       9.62
F            58.63      10.11          56.79      10.55
K            54.59       8.78          54.05       8.00
1(Hs)        56.11      11.65          55.21      12.05
2(D)         64.68      12.93          62.28      13.67
3(Hy)        63.28      15.37          63.90      15.91
4(Pd)        66.08       8.41          65.31       9.24
5(Mf)        51.61      11.73          51.62      10.81
6(Pa)        59.76      12.41          60.85      11.76
7(Pt)        63.84      12.44          63.79      12.54
8(Sc)        66.54      13.10          64.38      12.92
9(Ma)        60.35      10.11          61.41      10.60
0(Si)        55.28      12.09          53.44      11.52
APPENDIX B


TABLE 9


MEAN SCALE SCORES ON THE MMPI FOR LONG
TERM THERAPY CASES: STANDARDIZATION
AND CROSS-VALIDATION SAMPLES


           Standardization Sample     Cross-Validation Sample
                    n=69                       n=35
Scale          M          SD             M          SD

L            46.49       9.74          45.71       9.43
F            63.45      11.29          63.11      10.88
K            51.42       8.36          48.89       9.04
1(Hs)        56.70      12.17          57.40      11.91
2(D)         70.87      14.71          72.34      14.20
3(Hy)        63.09      16.01          67.91      16.24
4(Pd)        68.72       9.77          69.66       8.81
5(Mf)        55.12      10.62          59.91      10.72
6(Pa)        62.51      13.19          65.31      12.11
7(Pt)        69.81      12.77          72.91      12.91
8(Sc)        71.70      12.40          72.61      12.63
9(Ma)        64.52      13.02          61.49      12.89
0(Si)        60.19      13.13          58.51      12.60
APPENDIX C


TABLE 10


DIAGNOSES AND SCORES FOR CODE TYPES USED
IN THE TASK TO DETERMINE EXPERTISE


MMPI Code
Type        Diagnosis                                          Score

2-7         Psychoneurotic Depression/Anxiety Reaction           3
            Psychotic Depression                                 2
            Chronic Brain Disorder                               1

2-7-4       Psychoneurotic Depression                            3
            Passive-Aggressive Personality                       2
            Mixed Psychosis                                      1

2-7-8       Schizophrenic                                        3
            Anxiety Reaction/Obsessive-Compulsive Neurosis       2
            Schizoid Personality/Acute Brain Disorder            1

2-8         Schizophrenic, schizo-affective                      3
            Acute Brain Disorder                                 2
            Mixed Psychoneurosis                                 1

3-1         Conversion Reaction/Psychophysiologic Reaction       3
            Mixed Psychosis                                      2
            Dependent Personality/Acute Brain Disorder           1

3-2-1       Psychophysiologic Reaction                           3
            Psychotic Depression                                 2
            Passive-Aggressive Personality                       1

4-6         Schizophrenic, paranoid                              3
            Mixed Personality Disorder                           2

4-6-2       Passive-Aggressive Personality                       3
            Mixed Psychoneurosis                                 2
            Mixed Psychosis                                      1
TABLE 10 (continued)


MMPI Code
Type        Diagnosis                                          Score

4-8-2       Schizophrenic, paranoid                              3
            Sociopathic Personality                              2
            Mixed Psychoneurosis                                 1

4-9         Sociopathic Personality                              3
            Mixed Psychosis                                      2

8-3         Schizophrenic/Manic-Depressive Psychosis             3
            Dissociative Reaction/Mixed Psychoneurosis           2
            Schizoid Personality                                 1

8-6         Schizophrenic, paranoid                              3
            Paranoid Personality                                 2
            Chronic Brain Disorder                               1

8-9         Schizophrenic, mixed                                 3
            Psychoneurotic Depression                            2
            Acute Brain Disorder                                 1

9-6         Schizophrenic, paranoid                              3
            Mixed Psychoneurosis                                 2
            Sociopathic Personality                              1